Fixes Issue #533 - Fix tuner dropping the first CLI argument. by hliu-ai · Pull Request #536 · vwxyzjn/cleanrl

hliu-ai · 2026-01-12T21:56:05Z

Description

Fixes #533.

cleanrl_utils/tuner.py built sys.argv as a flags-only list, so the first tuned flag landed in sys.argv[0]. When runpy executes the target script, it sets sys.argv[0] to the script path, overwriting whatever argument was placed there. As a result, the first tuned hyperparameter silently stayed at its script default in every trial instead of taking the value sampled for that trial.

This change preserves sys.argv[0] and appends the constructed flags:
sys.argv = [sys.argv[0]] + algo_command + [...]
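
For anyone who wants to see the mechanism in isolation, here is a minimal sketch that does not touch the real tuner. The stand-in script, the run_with_argv helper, the "tuner.py" placeholder and the flag values are all hypothetical and only illustrate how runpy.run_path replaces sys.argv[0] while the script runs:

```python
import os
import runpy
import sys
import tempfile

# Hypothetical stand-in for a training script: it only records the argv it sees.
SCRIPT_SOURCE = "import sys\nseen_argv = list(sys.argv)\n"


def run_with_argv(argv):
    """Run the stand-in script through runpy with sys.argv set to `argv`.

    Returns the argv list as observed from inside the script.
    """
    with tempfile.TemporaryDirectory() as tmp:
        script_path = os.path.join(tmp, "dummy_algo.py")
        with open(script_path, "w") as f:
            f.write(SCRIPT_SOURCE)
        saved_argv = sys.argv
        try:
            sys.argv = list(argv)
            # run_path temporarily replaces sys.argv[0] with the script path.
            result_globals = runpy.run_path(script_path, run_name="__main__")
        finally:
            sys.argv = saved_argv  # leave the interpreter state untouched
        return result_globals["seen_argv"]


# Illustrative flags only; the real tuner builds these from the sampled params.
flags = ["--learning-rate=0.001", "--gamma=0.99"]

# Old behaviour: flags-only argv, so the first flag sits in slot 0 and is lost.
buggy_seen = run_with_argv(flags)

# Fixed behaviour: keep a placeholder in slot 0 and append the flags after it.
fixed_seen = run_with_argv(["tuner.py"] + flags)

print("flags seen before fix:", buggy_seen[1:])
print("flags seen after fix: ", fixed_seen[1:])
```

In the first call the script never sees the learning-rate flag, which matches the symptom described above; in the second call both flags arrive intact.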

Verification:

Reproduced locally on Windows with Python 3.10.11. Before the change, the first tuned hyperparameter (learning rate) fell back to the script default, while later flags (e.g., gamma) were applied.
After the change, the learning rate is passed correctly and varies per trial as expected.
Commit id: 004f8a0

Types of changes

Bug fix
New feature
New algorithm
Documentation

Checklist:

I've read the CONTRIBUTION guide (required).
I have ensured pre-commit run --all-files passes (required).
I have updated the tests accordingly (if applicable).
I have updated the documentation and previewed the changes via mkdocs serve.
- I have explained note-worthy implementation details.
- I have explained the logged metrics.
- I have added links to the original paper and related papers.

If you need to run benchmark experiments for a performance-impacting change:

I have contacted @vwxyzjn to obtain access to the openrlbenchmark W&B team.
I have used the benchmark utility to submit the tracked experiments to the openrlbenchmark/cleanrl W&B project, optionally with --capture_video.
I have performed RLops with python -m openrlbenchmark.rlops.
- For new feature or bug fix:
  - I have used the RLops utility to understand the performance impact of the changes and confirmed there is no regression.
- For new algorithm:
  - I have created a table comparing my results against those from reputable sources (i.e., the original paper or other reference implementation).
- I have added the learning curves generated by the python -m openrlbenchmark.rlops utility to the documentation.
- I have added links to the tracked experiments in W&B, generated by python -m openrlbenchmark.rlops ....your_args... --report, to the documentation.

vercel · 2026-01-12T21:56:10Z

@hliu-ai is attempting to deploy a commit to the Costa Huang's projects Team on Vercel.

A member of the Team first needs to authorize it.

hliu-ai · 2026-01-12T21:57:22Z

This is my very first ever PR so hopefully everything was done right! I'll continue looking for issues that I am capable of working on and keep learning.

Fix tuner dropping the first CLI argument.

bb8577d

hliu-ai wants to merge 1 commit into vwxyzjn:master from hliu-ai:fix-tuner-argv-drop
